Abstract

This paper studies the problem of nonparametric testing for the no-effect of a
random functional covariate on a functional response. That means testing whether
the conditional expectation of the response given the covariate is almost surely zero
or not, without imposing any model relating response and covariate. The response
and the covariate take values in possibly different separable Hilbert spaces. Hence
the situations with scalar response or covariate are particular cases. Our test
is based on the remark that checking the no-effect of the functional covariate is
equivalent to checking the nullity of the conditional expectation of the response
given a sufficiently rich set of projections of the covariate. Such projections could be
on elements from finite-dimension subspaces of the Hilbert space where the covariate
takes values. Then, the idea is to search for a finite-dimension element of norm 1 that is,
in some sense, the least favorable for the null hypothesis. Given such a least
favorable direction, it remains to check the nullity of the conditional expectation
of the functional response given the scalar product between the covariate and the
selected direction. We follow these steps using a nearest neighbors (NN) smoothing
approach. As a result, our test statistic is a quadratic form involving univariate
NN smoothing and the asymptotic critical values are given by the standard normal
law. The test is able to detect nonparametric alternatives, not only linear ones.
The responses could be heteroscedastic with conditional variance of unknown form.
The law of the covariate does not need to be known. An empirical study with both
simulated and real data is reported. The cases of functional response and functional
or scalar covariate are considered. Our conclusion is that the test could be easily
implemented and performs well in simulations and real data applications.
1 Introduction



There has been substantial recent work on the methodology of regression analysis with 
functional data where predictors, responses, or both of them can be viewed as random 
functions. Functional data arise in many applications, the monograph of Ramsay and 
Silverman (2005) provides many compelling examples. In this paper we focus on the case 
where both the response and the predictor (or covariate) are random elements taking 
values in a space of functions. The functional linear model is the benchmark approach, 
see Chiou, Müller and Wang (2004), Yao, Müller and Wang (2005), Gabrys, Horváth
and Kokoszka (2010) and the references therein. Recently, alternative nonparametric
approaches have been considered; see Ferraty et al. (2011), Lian (2011), Ferraty, Van 
Keilegom and Vieu (2012). 

An important step in statistical modeling is assessing the goodness-of-fit of the model
considered, for instance the functional linear model. To the best of our knowledge, only the
papers of Chiou and Müller (2007) and Kokoszka et al. (2008) investigate the problem of
goodness-of-fit. Chiou and Müller (2007) introduced diagnostics of the functional regression fit
using plots of functional principal components (FPC) scores of the response and the
covariate. They also used plots of residuals versus fitted values FPC scores. (The FPC are
the random coefficients in the Karhunen-Loève expansions.) It is easy to understand that
such two-dimensional plots cannot capture all types of effects of the covariate on the
response, such as the effect of interactions of the covariate FPC. Kokoszka
et al. (2008) used the response and covariate FPC scores to build a test statistic with a $\chi^2$
distribution under the null hypothesis of no linear effect. Again, by construction, the test
of Kokoszka et al. cannot detect any nonlinear alternative. When little is known about the
structure of the data, it is preferable to allow for flexible, nonparametric alternatives in
the goodness-of-fit test. Moreover, when proceeding to nonparametric estimation of the
link between the response and the predictor, one should also check whether the predictor
has an effect on the response or not.

Formally, the statistical issue we address in this paper can be formulated as follows.
Consider a sample of independent copies $(U_1, X_1), \dots, (U_n, X_n)$ of $(U, X)$, where $U$ and $X$
take values in some separable Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$. Without loss of generality we
may suppose that $U$ has zero expectation. The problem is to build a statistical test of
the hypothesis of no-effect of $X$ on $U$, that is

$$H_0 : E(U \mid X) = 0 \quad \text{almost surely (a.s.)}, \tag{1.1}$$

against the nonparametric alternative $P[E(U \mid X) = 0] < 1$.* Since $\mathcal{H}_1$ or $\mathcal{H}_2$ could be of
finite dimension, for instance the real line, this framework covers all the common situations
involving functional data. However, our focus will be on the case of a functional
response and a functional covariate.

*See for instance Parthasarathy (1967) for the construction of the expectation and conditional expec- 
tation of a Hilbert-space valued random variable. 
Testing goodness-of-fit or no-effect against nonparametric alternatives has been very little
explored in the functional data context. In the case of scalar response, Delsol, Ferraty
and Vieu (2011) proposed a testing procedure adapted from the approach of Härdle and
Mammen (1993). However, their procedure involves smoothing in the functional space and
requires quite restrictive conditions which make it difficult to apply to real data situations.
Patilea, Sánchez-Sellero and Saumard (2012) and García-Portugués, González-Manteiga
and Febrero-Bande (2012) proposed alternative nonparametric goodness-of-fit tests for
scalar response and functional covariate using one-dimensional projections of the covariate.
Such projection-based methods are much less restrictive and perform well in applications.
To the best of our knowledge, no nonparametric statistical test of no-effect or goodness-of-fit is
available when both the response and the covariate are functional.

Our test is based on the remark that checking the no-effect of the functional covariate
is equivalent to checking the nullity of the conditional expectation of the response given a
sufficiently rich set of projections of the covariate. Such projections could be on elements
of norm 1 from finite-dimension subspaces of the Hilbert space where the covariate takes
values. Then, the idea is to search for a finite-dimension element of norm 1 that is, in some
sense, the least favorable for the null hypothesis. Given such a least favorable
direction, it remains to check the nullity of the conditional expectation of the functional
response given the scalar product between the covariate and the selected direction. Patilea,
Sánchez-Sellero and Saumard (2012) used a similar idea with scalar responses. We follow
these steps using a nearest neighbors (NN) smoothing approach. As a result, our new
test statistic is a quadratic form involving univariate NN smoothing and the asymptotic
critical values are given by the standard normal law. When the response is univariate, our
statistic is related to, but different from, the one introduced by Patilea, Sánchez-Sellero and
Saumard (2012). By construction, the test is able to detect nonparametric alternatives.
The responses could be heteroscedastic with conditional variance of unknown form. The
law of the covariate does not need to be known.

The paper is organized as follows. In section 2 we introduce the main notation and
we derive a fundamental lemma for our approach. This lemma states that checking
condition (1.1) is equivalent to checking the nullity of the conditional expectation of $U$
given a sufficiently rich set of projections of $X$ on elements of norm 1 from finite-dimension
subspaces of $\mathcal{H}_2$. In section 3 we introduce the test statistic for testing the no-effect of $X$
on $U$ when $U$ is observed. Our statistic is a quadratic form, based on univariate NN
smoothing, that behaves like a standard normal random variable under $H_0$. We prove
that, under mild technical assumptions, the induced test is consistent against any type of
fixed alternatives and against sequences of directional alternatives approaching the null
hypothesis at a suitable rate. The allowed rates are almost the same as those obtained in
parametric model checks based on smoothing with univariate covariate, see for instance
Guerre and Lavergne (2005). Clearly, our test procedure applies also to the case where
the sample of $U$ is not observed and has to be estimated, for instance as the residual
of a regression. Under suitable regularity conditions ensuring that the sample of $U$ is
estimated sufficiently accurately, the test statistic will still have standard normal critical
values. To keep this paper at reasonable length, the extension of our methodology to the
case of estimated responses will be investigated elsewhere. In section 4 we propose a
simple wild bootstrap procedure to approximate the critical values of our test statistic with
small samples and we report the results of several simulation experiments. In particular
we compare our test with the one proposed by Kokoszka et al. (2008). We conclude that
the test could be easily implemented and performs well in applications. The proofs are
relegated to the appendix.

2 A dimension reduction lemma 

In order to simplify the presentation and without loss of generality, hereafter we focus
on the case where the Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ are both equal to the space of
square-integrable random functions defined on the unit interval.

Let us introduce some notation. For any $p \geq 1$, let $\mathcal{S}^p = \{\gamma \in \mathbb{R}^p : \|\gamma\| = 1\}$ denote
the unit hypersphere in $\mathbb{R}^p$. Let $L^2[0,1]$ be the space of square-integrable real-valued
functions defined on the unit interval and let $\langle \cdot, \cdot \rangle$ denote the inner product in $L^2[0,1]$,
that is, for any $W_1, W_2 \in L^2[0,1]$,

$$\langle W_1, W_2 \rangle = \int_0^1 W_1(t) W_2(t)\,dt.$$

Let $\|\cdot\|_{L^2}$ be the associated norm. Hereafter $\mathcal{R} = \{\rho_1, \rho_2, \dots\}$ will be an arbitrarily fixed
orthonormal basis of the function space $L^2[0,1]$, that is $\langle \rho_i, \rho_j \rangle = \delta_{ij}$. Then the response
and the predictor processes can be expanded into

$$U(t) = \sum_{j=1}^{\infty} u_j \rho_j(t) \quad \text{and} \quad X(t) = \sum_{j=1}^{\infty} x_j \rho_j(t), \tag{2.2}$$

where the random coefficients $u_j$ (resp. $x_j$) are given by $u_j = \langle U, \rho_j \rangle$ (resp. $x_j = \langle X, \rho_j \rangle$).
For a fixed positive integer $p$ and any $W \in L^2[0,1]$ with coefficients $w_j = \langle W, \rho_j \rangle$,
$W^{(p)} \in L^2[0,1]$ will be the projection of $W$ on the subspace generated by the first $p$
elements of the basis $\mathcal{R}$, that is

$$W^{(p)}(t) = \sum_{j=1}^{p} w_j \rho_j(t).$$

By abuse of notation we also identify $W^{(p)}$ with the $p$-dimensional random vector $(w_1, \dots, w_p)$. On
the other hand, for any integer $p \geq 1$ and non random vector $\gamma = (\gamma_1, \dots, \gamma_p) \in \mathbb{R}^p$, we
identify $\gamma$ with $\sum_{j=1}^{p} \gamma_j \rho_j(t) \in L^2[0,1]$ and hence we write
$\langle W, \gamma \rangle = \langle W^{(p)}, \gamma \rangle = \sum_{j=1}^{p} w_j \gamma_j$.
In the following we will also use $\beta = \sum_{j=1}^{\infty} b_j \rho_j(t)$ to denote a non random element of
$L^2[0,1]$.
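
As an illustration of this notation, the following Python sketch approximates the basis coefficients of discretized curves and the resulting scalar products with a direction. It is only a sketch under stated assumptions: the curves are observed on a common grid of [0, 1], the integrals are approximated by the trapezoidal rule, and the sine basis is one possible choice of orthonormal basis. The function names `sine_basis` and `basis_coefficients` are hypothetical.

```python
import numpy as np

def sine_basis(p, t):
    """Rows are rho_j(t) = sqrt(2) * sin(j * pi * t), j = 1..p (an assumed basis choice)."""
    j = np.arange(1, p + 1)[:, None]
    return np.sqrt(2.0) * np.sin(j * np.pi * t[None, :])

def basis_coefficients(curves, basis, t):
    """Approximate w_j = <W, rho_j> for each curve (row) with the trapezoidal rule."""
    dt = np.diff(t)
    prod = curves[:, None, :] * basis[None, :, :]          # shape (n, p, T)
    return np.sum(0.5 * (prod[..., 1:] + prod[..., :-1]) * dt, axis=-1)

# Usage: a curve built from the first two basis functions.
t = np.linspace(0.0, 1.0, 2001)
basis = sine_basis(3, t)
W = (3.0 * basis[0] - 2.0 * basis[1])[None, :]
coefs = basis_coefficients(W, basis, t)                    # close to (3, -2, 0)
gamma = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)           # a direction of norm 1
score = coefs @ gamma                                      # approximates <W, gamma>
```

Once the coefficients are stored, every scalar product with a direction reduces to a finite-dimensional dot product, which is what makes a search over many directions cheap.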

Our approach relies on the following lemma, an extension of Lemma 2.1 of Lavergne
and Patilea (2008) and Theorem 1 in Bierens (1990) to Hilbert space-valued responses and
conditioning random variables. For any $\gamma \in \mathcal{S}^p$, let $F_\gamma$ denote the distribution function
(d.f.) of the real-valued variable $\langle X, \gamma \rangle$, that is $F_\gamma(t) = P(\langle X, \gamma \rangle \leq t)$, $\forall t \in \mathbb{R}$.
Lemma 2.1 Let $U, X \in L^2[0,1]$ be random functions. Assume that $E\|U\| < \infty$ and
$E(U) = 0$.

(A) The following statements are equivalent:

1. $E(U \mid X) = 0$ a.s.

2. $E[\langle U, E(U \mid \langle X, \gamma \rangle) \rangle] = 0$, $\forall p \geq 1$, $\forall \gamma \in \mathcal{S}^p$.

3. $E[\langle U, E\{U \mid F_\gamma(\langle X, \gamma \rangle)\} \rangle] = 0$, $\forall p \geq 1$, $\forall \gamma \in \mathcal{S}^p$.

(B) Suppose in addition that for any positive real number $s$,

$$E(\|U\| \exp\{s\|X\|\}) < \infty. \tag{2.3}$$

If $P[E(U \mid X) = 0] < 1$, then there exists a positive integer $p_0$ such that for any integer
$p \geq p_0$, the set

$$A = \{\gamma \in \mathcal{S}^p : E(U \mid \langle X, \gamma \rangle) = 0 \text{ a.s.}\} = \{\gamma \in \mathcal{S}^p : E(U \mid F_\gamma(\langle X, \gamma \rangle)) = 0 \text{ a.s.}\}$$

has Lebesgue measure zero on the unit hypersphere $\mathcal{S}^p$ and is not dense.

Point (A) is a cornerstone for proving the behavior of our test under the null and the
alternative hypotheses. Point (B) shows that in applications it will not be difficult to find
directions $\gamma$ able to reveal the failure of the null hypothesis (1.1) since, under the very
mild* conditions, such directions represent almost all the points on the unit hyperspheres
$\mathcal{S}^p$, provided $p$ is sufficiently large.

Let

$$Q(\gamma) = E[\langle U, E\{U \mid F_\gamma(\langle X, \gamma \rangle)\} \rangle]. \tag{2.4}$$

The following new formulation of $H_0$ is a direct consequence of Lemma 2.1 above.

Corollary 2.2 Consider an $L^2[0,1]$-valued random variable $U$ such that $E\|U\| < \infty$. The
following statements are equivalent:

1. The null hypothesis (1.1) holds true.

2. For any $p \geq 1$ and any set $B_p \subset \mathcal{S}^p$ with strictly positive Lebesgue measure on the
unit hypersphere $\mathcal{S}^p$,

$$\forall p \geq 1, \quad \max_{\gamma \in B_p} Q(\gamma) = 0. \tag{2.5}$$


*If $X$ does not satisfy condition (2.3), it suffices to transform $X$ into some variable $W \in L^2[0,1]$ such
that the $\sigma$-field generated by $W$ is the same as the one generated by $X$ and the variable $W$ satisfies
condition (2.3).






3 Testing the effect of a functional covariate 



We introduce a general approach for nonparametric testing of the no-effect of a functional
covariate $X$ on a functional random variable $U$, based on the characterization (2.5) of the
null hypothesis.

3.1 The test statistic 

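In view of equation (2.5), our goal is to estimate $Q(\gamma)$. Given a sample of $(U, X)$,
define

$$Q_n(\gamma) = \frac{1}{n(n-1)} \sum_{1 \leq i \neq j \leq n} \langle U_i, U_j \rangle \frac{1}{h} K_h\big(F_{\gamma,n}(\langle X_i, \gamma \rangle) - F_{\gamma,n}(\langle X_j, \gamma \rangle)\big),$$

where $K_h(\cdot) = K(\cdot/h)$, $K(\cdot)$ is a kernel, $h$ the bandwidth, and $F_{\gamma,n}$ is the empirical d.f.
of the sample $\langle X_1, \gamma \rangle, \dots, \langle X_n, \gamma \rangle$.*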
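
A minimal computational sketch of a quadratic form of this type is given below. It assumes that the statistic is the U-statistic obtained by summing, over pairs i ≠ j, the products ⟨U_i, U_j⟩ K((F_i − F_j)/h) and dividing by n(n − 1)h, where F_i is the rank of ⟨X_i, γ⟩ divided by n. The normalized Epanechnikov kernel, the trapezoidal approximation of the inner products and the function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def epanechnikov(x):
    """Normalized Epanechnikov density (an assumed kernel choice)."""
    return 0.75 * (1.0 - x**2) * (np.abs(x) <= 1.0)

def gram_matrix(U, t):
    """Matrix of inner products <U_i, U_j> in L2[0,1], trapezoidal rule on the grid t."""
    w = np.zeros_like(t, dtype=float)
    dt = np.diff(t)
    w[:-1] += dt / 2.0
    w[1:] += dt / 2.0
    return (U * w) @ U.T

def q_n(gram, scores, h, kernel=epanechnikov):
    """NN-smoothing quadratic form for one direction; scores[i] = <X_i, gamma>."""
    n = len(scores)
    # Empirical d.f. at the sample points = rank / n; ties are broken by index.
    ranks = np.argsort(np.argsort(scores, kind="stable")) + 1
    F = ranks / n
    K = kernel((F[:, None] - F[None, :]) / h)
    np.fill_diagonal(K, 0.0)                     # only pairs i != j contribute
    return np.sum(gram * K) / (n * (n - 1) * h)
```

Because only the ranks of the scores enter the computation, the value is unchanged under any strictly monotone transformation of the scores, and it is the same for γ and −γ when the kernel is symmetric.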

The statistic $Q_n(\gamma)$ is related to statistics considered by Fan and Li (1996) and Zheng
(1996) for checks of parametric regressions for finite dimension data. See also Patilea,
Sánchez-Sellero and Saumard (2012) for the extension of this type of statistics to testing
the goodness-of-fit of the functional linear model. The statistics considered by all these
authors are based on a Nadaraya-Watson regression estimator. Here we use the nearest
neighbor (NN) approach of Stute (1984) and hence our new statistic is more in the spirit
of the one introduced by Stute and González Manteiga (1996) to test simple linear models
with scalar outcome and covariate and homoscedastic error term. Herein we allow for
heteroscedasticity of unknown form and hence, in the particular case where $U$ and $X$ are
scalar, we extend the framework of Stute and González Manteiga (1996).

The idea of using projections of the covariates was also considered by Lavergne and 
Patilea (2008); see also Bierens (1990), Cuesta-Albertos et al. (2007), Cuesta-Albertos, 
Fraiman and Ransford (2007). The extension of the scope to functional responses seems 
to be new. 

Under $H_0$, by the Central Limit Theorem (CLT) for degenerate U-statistics, for fixed
$p$ and $\gamma \in \mathcal{S}^p$, $nh^{1/2} Q_n(\gamma)$ has an asymptotic centered normal distribution. Here we use the
CLT in Theorem 5.1 in de Jong (1987). We will show that the de Jong CLT still applies and the
asymptotic normal distribution is preserved even when $p$ grows at a suitable rate with
the sample size. On the other hand, Lemma 2.1-(B) indicates that if $p$ is large enough,
the maximum of $Q(\gamma)$ over $\gamma$ stays away from zero under the alternative hypothesis and
this will guarantee consistency against any departure from $H_0$.

The statistic $Q_n(\gamma)$ is expected to be close to $Q(\gamma)$ uniformly in $\gamma$, provided $p$ increases
suitably. Then a natural idea would be to build a test statistic using the maximum of
$Q_n(\gamma)$ with respect to $\gamma$. However, like in the finite dimension covariate case, under $H_0$

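* Ties in the values $\langle X_i, \gamma \rangle$, $1 \leq i \leq n$, could be broken by comparing indices, that is if $\langle X_i, \gamma \rangle =
\langle X_j, \gamma \rangle$, then we define $F_{\gamma,n}(\langle X_i, \gamma \rangle) < F_{\gamma,n}(\langle X_j, \gamma \rangle)$ if $i < j$. However, for simplicity in our assumptions
below we will assume that the $\langle X_i, \gamma \rangle$'s have continuous distribution for all $\gamma$.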
one expects $Q_n(\gamma)$ to converge to zero for any $p$ and $\gamma$ and thus the objective function of
the maximization problem to be flat. Therefore we will choose a direction $\gamma$ as the least
favorable direction for the null hypothesis $H_0$ obtained from a penalized criterion based on
a standardized version of $Q_n(\gamma)$; see also Lavergne and Patilea (2008) and Bierens (1990)
for related approaches. More precisely, fix some $\gamma_0 \in L^2[0,1]$ that could be interpreted as
an initial guess of an unfavorable direction for $H_0$. Let $b_{0j}$, $j \geq 1$, be the coefficients in
the expansion of $\gamma_0$ in the basis $\mathcal{R}$. For any given $p \geq 1$ such that $\sum_{j=1}^{p} b_{0j}^2 > 0$, let

$$\gamma_0^{(p)} = \frac{(b_{01}, \dots, b_{0p})}{\|(b_{01}, \dots, b_{0p})\|}.$$

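Let

$$v_n^2(\gamma) = \frac{2}{n(n-1)h} \sum_{j \neq i} \langle U_i, U_j \rangle^2 K_h^2\big(F_{\gamma,n}(\langle X_i, \gamma \rangle) - F_{\gamma,n}(\langle X_j, \gamma \rangle)\big), \quad \gamma \in \mathcal{S}^p, \tag{3.1}$$

be an estimate of the variance of $nh^{1/2} Q_n(\gamma)$. Given $B_p \subset \mathcal{S}^p$ with positive Lebesgue
measure in $\mathcal{S}^p$ that contains $\gamma_0^{(p)}$, the least favorable direction $\gamma$ for $H_0$ is defined by

$$\hat\gamma_n = \arg\max_{\gamma \in B_p} \left[ nh^{1/2} Q_n(\gamma)/v_n(\gamma) - \alpha_n I_{\{\gamma \neq \gamma_0^{(p)}\}} \right], \tag{3.2}$$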

where $I_A$ is the indicator function of a set $A$, and $\alpha_n$, $n \geq 1$, is a sequence of positive real
numbers increasing to infinity at an appropriate rate that depends on the rates of $h$ and $p$
and that will be made explicit below. Using a standardized version of $Q_n(\gamma)$ avoids scaling
$\alpha_n$ according to the variability of the observations. Let us notice that the maximization
used to define $\hat\gamma_n \in B_p \subset \mathcal{S}^p$ is a finite dimension optimization problem. The choice of $\gamma_0^{(p)}$
will be shown to be theoretically irrelevant: it will not affect the asymptotic critical values
and the consistency results. However, in practice the choice of $\gamma_0^{(p)}$ could be related to
prior information of the practitioner on a class of alternatives. Since $Q_n(\gamma) = Q_n(-\gamma)$ for
any $\gamma \in \mathcal{S}^p$, one could restrict the set $B_p$ to a half unit hypersphere like $\{\gamma \in \mathcal{S}^p : \gamma_1 > 0\}$.
One could restrict $B_p$ even more, and hence speed up optimization algorithms, when some
prior information indicates a set of directions that would be able to detect alternatives.

We will prove that with suitable rates of increase for $\alpha_n$ and $p$ and decrease for $h$, the
probability of the event $\{\hat\gamma_n = \gamma_0^{(p)}\}$ tends to 1 under $H_0$. Hence $Q_n(\hat\gamma_n)/v_n(\hat\gamma_n)$ behaves
asymptotically like $Q_n(\gamma_0^{(p)})/v_n(\gamma_0^{(p)})$, even when $p$ grows with the sample size. Therefore
the test statistic we consider is

$$T_n = nh^{1/2} \frac{Q_n(\hat\gamma_n)}{v_n(\hat\gamma_n)}. \tag{3.3}$$

We will show that an asymptotic $\alpha$-level test is given by $I(T_n \geq z_{1-\alpha})$, where $z_{1-\alpha}$ is the
$(1-\alpha)$-th quantile of the standard normal distribution.
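
The selection of the least favorable direction and the resulting statistic can be sketched as follows. This is an illustration under explicit assumptions, not the authors' code: the maximization is replaced by a search over a finite set of candidate directions, the variance estimate is taken to be 2/(n(n − 1)h) times the sum over i ≠ j of ⟨U_i, U_j⟩² K²((F_i − F_j)/h), the kernel is the normalized Epanechnikov density, and the bandwidth is large enough (h > 1/n) for the variance estimate to be positive. All function names are hypothetical.

```python
import numpy as np

def kernel(x):
    """Normalized Epanechnikov density (assumed choice)."""
    return 0.75 * (1.0 - x**2) * (np.abs(x) <= 1.0)

def standardized_stat(gram, scores, h):
    """n h^{1/2} Q_n / v_n for one direction, from the Gram matrix of the responses."""
    n = len(scores)
    F = (np.argsort(np.argsort(scores, kind="stable")) + 1) / n   # ranks / n
    K = kernel((F[:, None] - F[None, :]) / h)
    np.fill_diagonal(K, 0.0)
    q = np.sum(gram * K) / (n * (n - 1) * h)
    v2 = 2.0 * np.sum(gram**2 * K**2) / (n * (n - 1) * h)
    return n * np.sqrt(h) * q / np.sqrt(v2)

def least_favorable_statistic(gram, x_coefs, candidates, gamma0, h, alpha_n):
    """Penalized search: every direction other than gamma0 pays the penalty alpha_n."""
    best_value, best_stat, best_dir = -np.inf, None, None
    for gamma in np.vstack([gamma0, candidates]):
        stat = standardized_stat(gram, x_coefs @ gamma, h)
        penalty = 0.0 if np.allclose(gamma, gamma0) else alpha_n
        if stat - penalty > best_value:
            best_value, best_stat, best_dir = stat - penalty, stat, gamma
    return best_stat, best_dir       # statistic at the selected direction

def reject(t_n, z_quantile=1.6449):
    """One-sided decision; 1.6449 is approximately the 0.95 standard normal quantile."""
    return t_n >= z_quantile
```

With a very large penalty the search always returns the initial direction, while a zero penalty returns the unpenalized maximizer; intermediate values trade off between the two, as described above.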
3.2 Behavior under the null hypothesis

In order to derive the asymptotic behavior of the statistic $T_n$ under the null hypothesis, below
we introduce a set of assumptions on the data (Assumption D), and on the kernel and
the rates of $h$ and $p$ (Assumption K).

Assumption D 

(a) The random vectors $(U_1, X_1), \dots, (U_n, X_n)$ are independent draws from the random
vector $(U, X) \in L^2[0,1] \times L^2[0,1]$ that satisfies $E\|U\|^8 < \infty$.

(b) For any $p \geq 1$ and any $\gamma \in \mathcal{S}^p$, the d.f. $F_\gamma$ is continuous.

(c) $\exists\, \sigma^2, C_1, C_2 > 0$ and $\nu > 2$ such that:

(i) $0 < \sigma^2 \leq E\big(\langle U_1, U_2 \rangle^2 \mathbf{1}\{\langle U_1, U_2 \rangle \leq C_1\} \mid X_1, X_2\big)$ almost surely;

(ii) $E[\|U\|^\nu \mid X] \leq C_2$.

(d) For any $p \geq 1$, $\gamma_0^{(p)} \in B_p \subset \mathcal{S}^p$, the $B_p$ are open subsets of $\mathcal{S}^p$ and $B_p \times 0_{p'-p} \subset B_{p'}$,
$\forall\, 1 \leq p < p'$, where $0_p \in \mathbb{R}^p$ denotes the null vector of dimension $p$.

The continuity assumption required in (b) is a mild assumption that simplifies the
NN smoothing. Condition (c) allows us to prove that the variance of the statistics $Q_n(\gamma)$
is bounded away from zero and infinity uniformly with respect to $\gamma$. The very mild
conditions imposed on $B_p$ simplify the proofs of consistency. These conditions are
satisfied for instance when $B_p$ is a half unit hypersphere.

Assumption K 

(a) The kernel $K$ is a continuous density on the real line such that $K(x) = K(-x)$ and
$K(\cdot)$ is non increasing on $[0, \infty)$.

(b) $h \to 0$ and $nh^2 \to \infty$.

(c) $p \geq 1$ increases to infinity with $n$ and there exists a constant $\lambda > 0$ such that $p \ln^{-\lambda} n$
is bounded.

The first step to derive a test statistic is the study of the behavior of the process
$Q_n(\gamma)$, $\gamma \in B_p$, under $H_0$ when $p$ is allowed to increase with the sample size. The
following key lemma is crucially based on a powerful combinatorial result due to Cover
(1967) on the number of possible orderings of $\langle X_1, \gamma \rangle, \dots, \langle X_n, \gamma \rangle$ when $\gamma$ belongs to the
whole hypersphere $\mathcal{S}^p$, and on exponential inequalities for U-statistics.
Lemma 3.1 Under Assumptions D and K and if $H_0$ holds true,

$$\sup_{\gamma \in B_p} |Q_n(\gamma)| = O_P(n^{-1} h^{-1/2} p \ln n).$$

Moreover, if $v_n^2(\gamma)$ is the estimate defined in equation (3.1),

$$\sup_{\gamma \in B_p} \{1/v_n^2(\gamma)\} = O_P(1).$$

We now describe the behavior of $\hat\gamma_n$ under $H_0$. A suitable rate $\alpha_n$ will make $\hat\gamma_n$
equal to $\gamma_0^{(p)}$ with high probability. Under the null, $\alpha_n$ has to grow to infinity sufficiently
fast to render the probability of the event $\{\hat\gamma_n = \gamma_0^{(p)}\}$ close to 1. We will see below that,
for better detection of the alternative hypothesis, $\alpha_n$ should grow as slowly as possible. Indeed,
slower rates for $\alpha_n$ will allow the selection of directions $\hat\gamma_n$ that could be better suited than
$\gamma_0^{(p)}$ for revealing the departure from the null hypothesis. The rate of $p$ is also involved
in the search for a trade-off for the rate of $\alpha_n$: a larger $p$ slows down the rate of uniform
convergence to zero of $Q_n(\gamma)$, $\gamma \in B_p$, and hence requires a larger $\alpha_n$.

Lemma 3.2 Under Assumptions D and K, for a positive sequence $\alpha_n$, $n \geq 1$, such that
$\alpha_n p^{-1} \ln^{-1} n \to \infty$,

$$P(\hat\gamma_n = \gamma_0^{(p)}) \to 1, \quad \text{under } H_0.$$

The proof of Lemma 3.2 is similar to the proof of Lemma 3.2 in Lavergne and Patilea
(2008) and hence will be omitted. The following result shows that the asymptotic critical
values of our test statistic are standard normal.

Theorem 3.3 Under the conditions of Lemma 3.2 and if the hypothesis $H_0$ in (1.1) holds
true, the test statistic $T_n$ converges in law to a standard normal. Consequently, the test
given by $I(T_n \geq z_{1-\alpha})$, with $z_{1-\alpha}$ the $(1-\alpha)$-quantile of the standard normal distribution,
has asymptotic level $\alpha$.



3.3 The behavior under the alternatives 

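Our test is consistent against the general alternative

$$H_1 : P[E(U \mid X) = 0] < 1,$$

that is, the probability that the test statistic $T_n$ is larger than any quantile $z_{1-\alpha}$ tends to
one under $H_1$. This can be readily understood from the following simple inequalities:

$$\begin{aligned}
T_n &= \frac{nh^{1/2} Q_n(\hat\gamma_n)}{v_n(\hat\gamma_n)} \\
&= \max_{\gamma \in B_p} \left\{ nh^{1/2} Q_n(\gamma)/v_n(\gamma) - \alpha_n I_{\{\gamma \neq \gamma_0^{(p)}\}} \right\} + \alpha_n I_{\{\hat\gamma_n \neq \gamma_0^{(p)}\}} \\
&\geq \max_{\gamma \in B_p} \frac{nh^{1/2} Q_n(\gamma)}{v_n(\gamma)} - \alpha_n \geq \frac{nh^{1/2} Q_n(\gamma)}{v_n(\gamma)} - \alpha_n, \quad \forall \gamma \in B_p \subset \mathcal{S}^p,
\end{aligned} \tag{3.4}$$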
with $v_n(\gamma)$ defined in (3.1). Since $\mathrm{Var}(\langle U_1, U_2 \rangle \mid X_1, X_2) \geq \sigma^2$, it is clear that $1/v_n(\gamma) =
O_P(1)$ for all $\gamma$. On the other hand, from Lemma 2.1 there exist $p_0$ and $\tilde\gamma \in B_{p_0}$ such
that the expectation of $Q_n(\tilde\gamma)$ stays away from zero as the sample size grows to infinity
and $h$ decreases to zero. Moreover, for any $p > p_0$ and any $n$ and $h$, clearly
$\max_{\gamma \in B_p} Q_n(\gamma) \geq Q_n(\tilde\gamma)$, because $B_{p_0} \times 0_{p-p_0} \subset B_p$. All these facts show why our test is
omnibus, that is consistent against nonparametric alternatives, provided that $p \to \infty$.

To state the consistency result, let $\delta(X)$ be some $L^2[0,1]$-valued function such that
$E[\delta(X)] = 0$ and $0 < E[\|\delta(X)\|^4] < \infty$, and let $r_n$, $n \geq 1$, be a sequence of real numbers
that decrease to zero or $r_n = 1$, $\forall n$. Consider the sequence of alternatives

$$H_{1n} : U = U^0 + r_n \delta(X), \quad n \geq 1, \quad \text{with } U^0 \in L^2[0,1], \ E(U^0 \mid X) = 0.$$

We show below that such directional alternatives can be detected as soon as $r_n^2 n h^{1/2} / \alpha_n$
tends to infinity. This is exactly the condition one would obtain with scalar covariate;
see Lavergne and Patilea (2008). However, in the functional data framework, to obtain
the convenient standard normal critical values, we need $1/\alpha_n = o(p^{-1} \ln^{-1} n)$. Hence,
the rate $r_n$ at which the alternatives $H_{1n}$ tend to the null hypothesis should satisfy
$r_n^2 n h^{1/2} / \{p \ln n\} \to \infty$.

Theorem 3.4 Suppose that

(a) Assumption D holds true with $U$ replaced by $U^0$;

(b) Assumption K is satisfied and in addition $nh^4 \to \infty$ and there exists a constant $C$
such that $|K(u) - K(v)| \leq C|u - v|$, $\forall u, v \in \mathbb{R}$;

(c) $\alpha_n / \{p \ln n\} \to \infty$ and $r_n$, $n \geq 1$, is such that $r_n^2 n h^{1/2} / \alpha_n \to \infty$;

(d) $E[\delta(X)] = 0$ and $0 < E[\|\delta(X)\|^4] < \infty$;

(e) there exist $p$ and $\gamma \in B_p \subset \mathcal{S}^p$ (independent of $n$) such that $E[\delta(X) \mid \langle X, \gamma \rangle] \neq 0$
and $\forall t \in [0,1]$, the Fourier Transform of $\bar\delta(t, \cdot) = E[\delta(X)(t) \mid F_\gamma(\langle X, \gamma \rangle) = \cdot\,]$ is
integrable.

Then the test based on $T_n$ is consistent against the sequence of alternatives $H_{1n}$.

The additional Lipschitz condition on the kernel $K(\cdot)$ and the restriction on the bandwidth
range in Theorem 3.4(b) are reasonable technical conditions that greatly reduce
the complexity of the proof of the consistency. The existence of a vector $\gamma$ such that
$E[\delta(X) \mid \langle X, \gamma \rangle] \neq 0$ is guaranteed by Lemma 2.1(B). In Theorem 3.4(e) we impose a
convenient mild technical condition on one such vector.
4 Empirical study



A simulation study was carried out to assess the behavior of the proposed methods under 
the null and with different types of effects under the alternative. For comparison with 
the procedure proposed by Kokoszka et al. (2008), we considered a sample size n = 40. 
The critical values of our procedure were approximated by a wild bootstrap procedure as 
described below. 

4.1 Bootstrap procedure 

The bootstrap sample, denoted by $U_i^b$, $1 \leq i \leq n$, is obtained as $U_i^b = Z_i U_i$, $1 \leq
i \leq n$, where $Z_i$, $1 \leq i \leq n$, are independent random variables following the two-point
distribution proposed by Mammen (1993), that is, $Z_i = -(\sqrt{5} - 1)/2$ with probability
$(\sqrt{5} + 1)/(2\sqrt{5})$ and $Z_i = (\sqrt{5} + 1)/2$ with probability $(\sqrt{5} - 1)/(2\sqrt{5})$.

A bootstrap test statistic is built from a bootstrap sample in the same way as the original test
statistic. When this scheme is repeated many times, the bootstrap critical value $z_{1-\alpha,n}$
at level $\alpha$ is the empirical $(1-\alpha)$-th quantile of the bootstrapped test statistics. This
critical value is then compared to the initial test statistic.
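
The bootstrap scheme can be sketched in a few lines of Python. The two-point weights follow the distribution stated above; the argument `statistic` is a hypothetical placeholder for any function that maps a response sample (one discretized curve per row) to the value of the test statistic, with the covariate sample held fixed.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)

def mammen_weights(n, rng):
    """Draw n i.i.d. two-point weights with mean 0 and variance 1 (Mammen, 1993)."""
    low, high = -(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / 2.0
    p_low = (SQRT5 + 1.0) / (2.0 * SQRT5)
    return np.where(rng.random(n) < p_low, low, high)

def bootstrap_critical_value(statistic, U, n_boot, alpha, rng):
    """Empirical (1 - alpha) quantile of the statistic over wild bootstrap samples.

    `statistic` is a user-supplied callable (hypothetical here): it receives the
    bootstrap responses Z_i * U_i and returns a scalar.
    """
    boot = np.empty(n_boot)
    for b in range(n_boot):
        z = mammen_weights(U.shape[0], rng)
        boot[b] = statistic(z[:, None] * U)      # each curve multiplied by its own weight
    return np.quantile(boot, 1.0 - alpha)
```

The null hypothesis is then rejected at level alpha when the statistic computed on the original sample exceeds the returned value.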

4.2 Simulation study 

The first situation we considered was a functional linear model, given by

$$U_i(t) = \int_0^1 \psi(s,t) X_i(s)\,ds + \epsilon_i(t), \quad 1 \leq i \leq n,$$

where $X_i$ and $\epsilon_i$ are independent Brownian bridges and $\psi$ is square-integrable over $[0,1) \times
[0,1)$. The kernel $\psi$ was chosen to be $\psi(s,t) = c \cdot \exp(t^2 + s^2)/2$, with $c = 0$ under the
null and $c = 0.3$ under the alternative.
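
A data-generating sketch for this design is given below. It assumes the usual integral form of the functional linear model, in which the response at time t equals the integral over s of ψ(s, t) X(s) plus an error curve; it also assumes an equally spaced grid on [0, 1] starting at 0 and ending at 1, bridges built from cumulated Gaussian increments, and a trapezoidal approximation of the integral. The function names are hypothetical.

```python
import numpy as np

def brownian_bridges(n, t, rng):
    """n Brownian bridge paths on the grid t (t[0] = 0 and t[-1] = 1 are assumed)."""
    dt = np.diff(t, prepend=0.0)                       # first increment is 0
    W = np.cumsum(rng.standard_normal((n, t.size)) * np.sqrt(dt), axis=1)
    return W - t * W[:, -1:]                           # B(t) = W(t) - t W(1)

def functional_linear_sample(n, t, c, rng):
    """Responses U_i(t) = int psi(s, t) X_i(s) ds + eps_i(t), psi = c exp(t^2 + s^2) / 2."""
    X = brownian_bridges(n, t, rng)
    eps = brownian_bridges(n, t, rng)
    psi = c * np.exp(t[:, None]**2 + t[None, :]**2) / 2.0   # psi[s index, t index]
    w = np.zeros_like(t)                                # trapezoidal weights in s
    step = np.diff(t)
    w[:-1] += step / 2.0
    w[1:] += step / 2.0
    U = (X * w) @ psi + eps
    return U, X

rng = np.random.default_rng(2024)
t = np.linspace(0.0, 1.0, 101)
U_null, X_null = functional_linear_sample(40, t, 0.0, rng)   # c = 0: no effect
U_alt, X_alt = functional_linear_sample(40, t, 0.3, rng)     # c = 0.3: linear effect
```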

The well-known Karhunen-Loève decomposition of the Brownian bridge provides a
good approximation of the covariate function. Thus, the orthonormal basis of eigenfunctions
$\mathcal{R} = \{\sqrt{2} \sin(j\pi t) : 0 \leq t \leq 1,\ j = 1, 2, \dots\}$ seems a good choice for our test
statistic. Different possibilities for the privileged direction $\gamma_0^{(p)}$ were considered. The
direction $\gamma_0^{(p)} = (1, 0, \dots, 0) \in \mathcal{S}^p$ generally provides a powerful test. Here we present the
results for an uninformative direction, with the same coefficients in all basis elements.
For the penalization we used the value $\alpha_n = 1$, which provides a good trade-off between
the privileged direction and the direction maximizing the standardized statistic.

To compute the statistic for each direction, we used the Epanechnikov kernel, $K(x) =
(1 - x^2) I_{\{|x| \leq 1\}}$. A grid of bandwidths was considered in order to explore the effect of the
bandwidth on the power of the test.

The number of basis components was $p = 3$. For the optimization over the hypersphere
$\mathcal{S}^p$, a grid of 1200 points was used. For each original sample, we used 499 bootstrap
samples to compute the critical value. One thousand original samples of size $n = 40$ were
generated to approximate the percentages of rejection.
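
The construction of the grid of directions is not detailed here. One possible way to obtain 1200 roughly evenly spread unit vectors for p = 3 is a Fibonacci-type lattice restricted to the half hypersphere with positive first coordinate; the sketch below uses that construction purely as an assumption, and the function name is hypothetical.

```python
import numpy as np

def half_sphere_grid(m):
    """m roughly evenly spread unit vectors in R^3 with first coordinate > 0.

    Fibonacci-type lattice: equally spaced heights (equal-area bands) combined
    with golden-angle rotations. This is an assumed construction.
    """
    k = np.arange(m)
    g1 = 1.0 - (k + 0.5) / m                    # first coordinate, strictly in (0, 1)
    radius = np.sqrt(1.0 - g1**2)
    phi = k * np.pi * (3.0 - np.sqrt(5.0))      # golden angle increments
    return np.column_stack([g1, radius * np.cos(phi), radius * np.sin(phi)])

directions = half_sphere_grid(1200)
```

Restricting the grid to a half hypersphere loses nothing when the statistic takes the same value at a direction and at its opposite.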

Figure 1 shows the empirical powers obtained for a grid of values of the bandwidth
both under the null hypothesis of no-effect and under the functional linear alternative.
We observe that the power is not very much affected by the bandwidth around a possibly
optimal value. For purposes of comparison, the empirical power of the Kokoszka et al.
(2008) test is also shown. These authors proposed a test of the functional linear effect,
that is, a test specially designed to detect the alternative of a functional linear effect
versus the no-effect. Our test provides similar or even better power than Kokoszka et
al.'s parametric test in their ideal framework. The level is quite well respected for any of
the considered bandwidths.



[Plot omitted. Horizontal axis: Bandwidth (1.05 to 1.55). Curves: Kokoszka's test; Our test c=0.3; Our test c=0.]

Figure 1: Testing the null-effect versus a functional linear alternative.



Another alternative was considered of the following type:

$$U_i(t) = \beta(t) X_i(t) + \epsilon_i(t), \quad 1 \leq i \leq n,$$

where $X_i$ and $\epsilon_i$ are independent Brownian bridges (as in the previous situation) and $\beta$ is a
square-integrable function on $[0,1]$. This is the so-called concurrent model studied in detail
in Ramsay and Silverman (2005), where the covariate at time $t$, $X_i(t)$, only influences the
response function at time $t$, $U_i(t)$. The function $\beta$ was $\beta(t) = c \cdot \exp(-4(t - 0.3)^2)$, with
$c = 0$ under the null and $c = 0.6$ under the alternative.

Figure 2 shows the power of our test under the concurrent alternative, in comparison
with Kokoszka et al.'s test. In this case, Kokoszka et al.'s test is slightly more powerful
than ours. This is not necessarily surprising since the concurrent model is in a sense a
degenerate functional linear model.



[Plot omitted. Horizontal axis: Bandwidth (1.05 to 1.55). Curves: Kokoszka's test; Our test c=0.6; Our test c=0.]

Figure 2: Testing the null-effect versus a concurrent model alternative.



A completely nonlinear alternative was also considered. In this case a quadratic model
of this type was generated:

$$U_i(t) = H_2(X_i(t)) + \epsilon_i(t), \quad 1 \leq i \leq n,$$

where $X_i$ and $\epsilon_i$ are independent Brownian motion and Brownian bridge, respectively,
and $H_2(x) = x^2 - 1$. Since the covariate function is a Brownian motion, instead of the
Brownian bridge of the previous situations, the basis was chosen as the orthonormal basis
of eigenfunctions of the Brownian motion.

Figure 3 shows the percentages of rejections under the null and under this quadratic
alternative for a range of bandwidths. The power of Kokoszka et al.'s test is also
plotted. As expected, Kokoszka et al.'s test, which was designed to detect only linear
effects, is not powerful under this quadratic alternative.
[Plot omitted. Horizontal axis: Bandwidth (0.16 to 0.37). Curves: Kokoszka's test; Our test c=1; Our test c=0.]

Figure 3: Testing the null-effect versus a quadratic alternative.



5 Appendix: technical proofs 

In this section $c, c_1, C, C_1, \dots$ denote constants that may have different values from line to
line. Recall that if $X = \sum_{j \geq 1} x_j \rho_j$, then $X^{(p)} = \sum_{j=1}^{p} x_j \rho_j$.

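Proof of Lemma 2.1. (A) We have

$$\begin{aligned}
\mathbb{E}(U \mid X) = 0 \ &\Leftrightarrow\ \mathbb{E}(\langle U, \rho_j \rangle \mid X) = 0, \quad \forall j \ge 1 \\
&\Leftrightarrow\ \mathbb{E}(\langle U, \rho_j \rangle \mid X^{(p)}) = 0, \quad \forall j \ge 1, \forall p \ge 1 \\
&\Leftrightarrow\ \mathbb{E}(\langle U, \rho_j \rangle \mid \langle X, \gamma \rangle) = 0, \quad \forall j \ge 1, \forall p \ge 1, \forall \gamma \in \mathcal{S}^p \\
&\Leftrightarrow\ \mathbb{E}(U \mid \langle X, \gamma \rangle) = 0, \quad \forall p \ge 1, \forall \gamma \in \mathcal{S}^p \\
&\Leftrightarrow\ \mathbb{E}(U \mid F_\gamma(\langle X, \gamma \rangle)) = 0, \quad \forall p \ge 1, \forall \gamma \in \mathcal{S}^p.
\end{aligned}$$

The first and the fourth equivalence in the last display are due to the fact that $\mathcal{R}$ is a basis in $L^2[0,1]$. Next, note that by Cauchy-Schwarz inequality, $\forall j$, $\mathbb{E}|\langle U, \rho_j \rangle| \le \mathbb{E}\|U\| < \infty$. Thus the second equivalence in the last display is guaranteed by elementary properties of conditional expectations and Doob's Martingale Convergence Theorem, while the third equivalence is given by Lemma 2.1-(A) of Lavergne and Patilea (2008). For the last equivalence recall that for any random variable $Y$ with d.f. $F$, $\mathbb{P}(F^{-1} \circ F(Y) \ne Y) = 0$, where $F^{-1}(t) = \inf\{y : F(y) \ge t\}$, $\forall\, 0 < t < 1$; see for instance Proposition 3, Chapter 1 in Shorack and Wellner (1986). Deduce that $\mathbb{E}(U \mid \langle X, \gamma \rangle) = \mathbb{E}(U \mid F_\gamma(\langle X, \gamma \rangle))$. To complete the proof of part (A) it suffices to note that

$$\begin{aligned}
\mathbb{E}[\langle U, \mathbb{E}(U \mid \langle X, \gamma \rangle) \rangle] &= \mathbb{E}[\|\mathbb{E}(U \mid \langle X, \gamma \rangle)\|^2] \\
&= \mathbb{E}[\|\mathbb{E}(U \mid F_\gamma(\langle X, \gamma \rangle))\|^2] \\
&= \mathbb{E}[\langle U, \mathbb{E}(U \mid F_\gamma(\langle X, \gamma \rangle)) \rangle].
\end{aligned}$$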

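(B) First note that $\mathcal{A} \subset \bigcap_{j \ge 1} \mathcal{A}_j$, where

$$\mathcal{A}_j = \{\gamma \in \mathcal{S}^p : \mathbb{E}(\langle U, \rho_j \rangle \mid \langle X, \gamma \rangle) = 0 \ \text{a.s.}\}.$$

Now, if $\mathbb{P}[\mathbb{E}(U \mid X) = 0] < 1$, then there exists $j \ge 1$ such that $\mathbb{P}[\mathbb{E}(\langle U, \rho_j \rangle \mid X) = 0] < 1$. For such a $j$, Lemma 2.1 in Patilea, Saumard and Sanchez (2012) allows us to deduce that there exists $p_0 \ge 1$ such that, for any $p \ge p_0$, $\mathcal{A}_j$ has Lebesgue measure zero on $\mathcal{S}^p$ and is not dense. Since $\mathcal{A}$ is included in any $\mathcal{A}_j$, the conclusion follows. ■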

Lemma 5.1 Let $K$ be a density satisfying Assumption K-(a) and assume that $h \to 0$ and $nh \to \infty$. Let

$$S_{ni} = \frac{1}{(n-1)h} \sum_{1 \le j \le n,\, j \ne i} K\left(\frac{i-j}{nh}\right) \quad \text{and} \quad S_n = \frac{1}{n} \sum_{1 \le i \le n} S_{ni}.$$

Then there exist constants $c_1, c_2$ such that for sufficiently large $n$

$$0 < c_1 \le \min_{1 \le i \le n} S_{ni} \le \max_{1 \le i \le n} S_{ni} \le c_2 < \infty.$$

Moreover, $S_n \to 1$.

Proof. Clearly $S_n - \tilde{S}_n \to 0$, where

$$\tilde{S}_n = \frac{1}{n^2 h} \sum_{1 \le i, j \le n} K\left(\frac{i-j}{nh}\right).$$

If $[a]$ denotes the integer part of any real number $a$, we can write

$$\begin{aligned}
\tilde{S}_n &= \frac{1}{h} \int_{1/n}^{(n+1)/n} \int_{1/n}^{(n+1)/n} K\left(\frac{[ns] - [nt]}{nh}\right) ds\, dt \\
&= \int_{1/n}^{(n+1)/n} \int_{1/nh - t/h}^{1/h + 1/nh - t/h} K\left(\frac{[nt + nzh] - [nt]}{nh}\right) dz\, dt \qquad [z = (s-t)/h] \\
&= \int_{1/n}^{(n+1)/n} \int_{1/nh - t/h}^{1/h + 1/nh - t/h} K(z)\, dz\, dt + o(1) \\
&= \int_{-1/h}^{1/h} \int_{1/n - zh}^{1 + 1/n - zh} dt\, K(z)\, dz + o(1) \qquad [\text{Fubini}] \\
&\to 1,
\end{aligned}$$

where the order $o(1)$ of the remainder on the right-hand side of the third equality could be obtained as a consequence of the fact that $K$ is symmetric and monotonic. Hence $S_n \to 1$.
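Similarly, for each $1 \le i \le n$, we can write

$$S_{ni} = \int_{(1-i)/nh}^{1/h + (1-i)/nh} K(z)\, dz + o(1) \qquad [z = (t - i/n)/h].$$

Deduce that

$$\int K(z)\, dz + r_{ni} \le S_{ni} \le \int K(z)\, dz + r'_{ni},$$

where $\max_{1 \le i \le n} \{|r_{ni}| + |r'_{ni}|\} = o(1)$ [the limits of integration of the two integrals in this display are not legible in the source]. The result follows.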
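A quick numerical look at Lemma 5.1 may help fix ideas. The sketch below evaluates $S_{ni}$ and $S_n$ directly from their definitions. The Epanechnikov kernel is only an illustrative choice (an assumption, not the kernel prescribed by the paper), and the function names are hypothetical.

```python
def epanechnikov(z):
    """A symmetric density, monotone on each half-line (illustrative choice)."""
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def s_ni(i, n, h, kernel=epanechnikov):
    """S_ni = (1 / ((n - 1) h)) * sum over j != i of K((i - j) / (n h))."""
    total = sum(kernel((i - j) / (n * h)) for j in range(1, n + 1) if j != i)
    return total / ((n - 1) * h)

def s_n(n, h, kernel=epanechnikov):
    """S_n = average of S_ni over i = 1..n."""
    return sum(s_ni(i, n, h, kernel) for i in range(1, n + 1)) / n

if __name__ == "__main__":
    n, h = 400, 0.05
    values = [s_ni(i, n, h) for i in range(1, n + 1)]
    print(min(values), max(values), s_n(n, h))
```

For indices $i$ near the boundary the kernel window is cut in half, which is why the minimum of the $S_{ni}$ sits near $1/2$ while the average $S_n$ stays close to 1.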



One of the ingredients we will use for the proof of Lemma 3.1 is a moment inequality for U-statistics presented in Lemma 5.2 below and due to Giné, Latała and Zinn (2000). To state the result we will use, let us introduce some notation. Let $Z_1, \ldots, Z_n$ be independent random variables (not necessarily with the same distribution) taking values in a measurable space $(\mathcal{Z}, \mathcal{T})$. Let $h_{i,j}(\cdot, \cdot)$, $1 \le i, j \le n$, be real-valued measurable functions on $\mathcal{Z}^2$ such that $h_{i,j}(z_i, z_j) = h_{j,i}(z_j, z_i)$ and $\mathbb{E}[h_{i,j}(z_i, Z_j)] = 0$, $\forall\, 1 \le i, j \le n$, $\forall z_i, z_j$. The functions $h_{i,j}$ could be different for different values of $n$. Define

$$A_n = \max_{i,j} \|h_{i,j}(\cdot, \cdot)\|_\infty, \qquad B_n^2 = \max_{j} \Big\| \sum_{i} \mathbb{E}\, h_{i,j}^2(Z_i, \cdot) \Big\|_\infty, \qquad C_n^2 = \sum_{i,j} \mathbb{E}\, h_{i,j}^2(Z_i, Z_j),$$

and

$$D_n = \sup\Big\{ \mathbb{E} \sum_{i,j} h_{i,j}(Z_i, Z_j) f_i(Z_i) g_j(Z_j) \ :\ \mathbb{E} \sum_{i} f_i^2(Z_i) \le 1,\ \mathbb{E} \sum_{j} g_j^2(Z_j) \le 1 \Big\}.$$

The following result is a simplified version of Theorem 3.3 in Giné, Latała and Zinn (2000).

Lemma 5.2 There exists a universal constant $L < \infty$ (in particular, independent of $n$ and the functions $h_{i,j}$) such that

$$\mathbb{P}\Big( \Big| \sum_{i,j} h_{i,j}(Z_i, Z_j) \Big| > t \Big) \le L \exp\left[ -\frac{1}{L} \min\left( \frac{t^2}{C_n^2},\ \frac{t}{D_n},\ \frac{t^{2/3}}{B_n^{2/3}},\ \frac{t^{1/2}}{A_n^{1/2}} \right) \right], \qquad \forall t > 0.$$



Let $\gamma \in \mathcal{S}^p$ and let $x_1, \ldots, x_n$ be an arbitrary collection of non-random points in $L^2[0,1]$. Consider $Z_1, \ldots, Z_n$ independent random variables with values in $L^2[0,1]$ such that for each $1 \le i \le n$ the law of $Z_i$ is the conditional law of $U_i$ given $X_i = x_i$. We will apply Lemma 5.2 with $h_{i,i} = 0$ and, for $1 \le i \ne j \le n$,

$$h_{i,j}(Z_i, Z_j) = \frac{\langle \bar{Z}_i, \bar{Z}_j \rangle}{n(n-1)h M^2}\, K_h\big( F_{\gamma,n}(\langle x_i, \gamma \rangle) - F_{\gamma,n}(\langle x_j, \gamma \rangle) \big), \qquad (5.1)$$

where $\bar{Z}_i = Z_i 1_{\{\|Z_i\| \le M\}} - \mathbb{E}[Z_i 1_{\{\|Z_i\| \le M\}}]$ and $M > 0$ is some constant (that we will allow to increase with $n$).§ Here $F_{\gamma,n}$ is the empirical d.f. of the sample $\langle x_1, \gamma \rangle, \ldots, \langle x_n, \gamma \rangle$. The functions $h_{i,j}(\cdot, \cdot)$ vanish outside the rectangle $[-2M, 2M] \times [-2M, 2M]$. The following lemma provides upper bounds for the quantities $A_n$ to $D_n$ in this setup. The bounds are independent of the collection $x_1, \ldots, x_n \in L^2[0,1]$, and of $p \ge 1$ and $\gamma \in \mathcal{S}^p$.

Lemma 5.3 Under the conditions of Lemma 3.1, for $h_{i,j}$ defined as in (5.1),

$$A_n \le \frac{c}{n(n-1)h}, \qquad B_n^2 \le \frac{c}{n^3 h M^2}, \qquad C_n^2 \le \frac{c}{n^2 h M^4}, \qquad D_n \le \frac{c}{n M^2},$$

for some constant $c$ depending only on the upper bound of $\mathbb{E}(\|U\|^2 \mid X)$ and $\int K^2$.
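Proof. The bound for $A_n$ is obvious. For $C_n^2$ note that

$$\mathbb{E}[h_{i,j}^2(Z_i, Z_j)] = \frac{1}{n^2(n-1)^2 h^2 M^4}\, \mathbb{E}[\langle \bar{Z}_i, \bar{Z}_j \rangle^2]\, K_h^2\big( F_{\gamma,n}(\langle x_i, \gamma \rangle) - F_{\gamma,n}(\langle x_j, \gamma \rangle) \big).$$

By Cauchy-Schwarz inequality and triangle inequality, and recalling that $Z_i$ is distributed according to the conditional law of $U_i$ given $X_i = x_i$,

$$\mathbb{E}[\langle \bar{Z}_i, \bar{Z}_j \rangle^2] \le 16\, \mathbb{E}\|Z_i\|^2\, \mathbb{E}\|Z_j\|^2 \le 16 C^2$$

for any constant $C$ that bounds from above $\mathbb{E}(\|U\|^2 \mid X)$, see Assumption D-(c). Finally, note that

$$\sum_{1 \le i \ne j \le n} K_h^2\big( F_{\gamma,n}(\langle x_i, \gamma \rangle) - F_{\gamma,n}(\langle x_j, \gamma \rangle) \big) = \sum_{1 \le i \ne j \le n} K^2\left( \frac{i-j}{nh} \right)$$

and apply the second part of Lemma 5.1 to derive the bound for $C_n^2$. To derive the bound for $B_n^2$ recall that $h_{i,j}(Z_i, z)$ vanishes for $|z| > 2M$, use again Cauchy-Schwarz inequality and triangle inequality and the first part of Lemma 5.1. For the bound of $D_n$, using Cauchy-Schwarz inequality and the independence of $Z_i$ and $Z_j$, we can write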

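$$\begin{aligned}
\mathbb{E} \sum_{i,j} h_{i,j}(Z_i, Z_j) f_i(Z_i) g_j(Z_j) &\le \sum_{i \ne j} \mathbb{E}^{1/2}[f_i^2(Z_i)]\, \mathbb{E}^{1/2}[g_j^2(Z_j)]\, \mathbb{E}^{1/2}[\langle \bar{Z}_i, \bar{Z}_j \rangle^2]\, \frac{K_h\big( F_{\gamma,n}(\langle x_i, \gamma \rangle) - F_{\gamma,n}(\langle x_j, \gamma \rangle) \big)}{n(n-1)h M^2} \\
&\le \frac{4C}{M^2}\, |||\mathcal{K}|||_2,
\end{aligned}$$

where $C$ is such that $\mathbb{E}(\|U\|^2 \mid X) \le C$, $\mathcal{K}$ is the matrix with elements

$$\mathcal{K}_{ij} = \frac{K((i-j)/(nh))}{n(n-1)h}, \quad i \ne j, \qquad \text{and} \qquad \mathcal{K}_{ii} = 0,$$

and $|||\mathcal{K}|||_2$ is the spectral norm of $\mathcal{K}$. By definition, $|||\mathcal{K}|||_2 = \sup_{u \in \mathbb{R}^n, u \ne 0} \|\mathcal{K}u\| / \|u\|$ and $|u' \mathcal{K} w| \le |||\mathcal{K}|||_2 \|u\| \|w\|$ for any $u, w \in \mathbb{R}^n$. By Lemma 5.1, for any $u \in \mathbb{R}^n$,

$$\begin{aligned}
\|\mathcal{K}u\|^2 &= \sum_{i=1}^n \Big( \sum_{j=1, j \ne i}^n \frac{K((i-j)/(nh))}{hn(n-1)}\, u_j \Big)^2 \\
&\le \sum_{i=1}^n \Big( \sum_{j=1, j \ne i}^n \frac{K((i-j)/(nh))}{hn(n-1)} \Big) \Big( \sum_{j=1, j \ne i}^n \frac{K((i-j)/(nh))}{hn(n-1)}\, u_j^2 \Big) \\
&\le \|u\|^2 \Big[ \max_{1 \le i \le n} \sum_{j \ne i} \frac{K((i-j)/(nh))}{hn(n-1)} \Big] \Big[ \max_{1 \le j \le n} \sum_{i \ne j} \frac{K((i-j)/(nh))}{hn(n-1)} \Big] \\
&\le c\, n^{-2} \|u\|^2, \qquad (5.2)
\end{aligned}$$

for some constant $c > 0$. The bound for $D_n$ follows immediately.

§ Note that in particular the values $\mathbb{E}[Z_i 1_{\{\|Z_i\| \le M\}}]$ coincide with the values $\mathbb{E}[U_i 1_{\{\|U_i\| \le M\}} \mid X_i = x_i]$.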
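The row-sum argument behind (5.2) can be checked numerically. The following sketch builds the matrix with entries $K((i-j)/(nh))/[n(n-1)h]$ off the diagonal and compares a power-iteration estimate of its spectral norm with its largest row sum. The Epanechnikov kernel and all function names are assumptions made for the illustration only.

```python
import math

def epanechnikov(z):
    """Illustrative symmetric kernel (an assumption, not the paper's choice)."""
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def kernel_matrix(n, h, kernel=epanechnikov):
    """Entries K((i - j) / (n h)) / (n (n - 1) h) off the diagonal, 0 on it."""
    scale = n * (n - 1) * h
    return [[0.0 if i == j else kernel((i - j) / (n * h)) / scale
             for j in range(n)] for i in range(n)]

def max_row_sum(a):
    """Largest absolute row sum; for a symmetric matrix it bounds the spectral norm."""
    return max(sum(abs(x) for x in row) for row in a)

def spectral_norm_estimate(a, iterations=100):
    """Power iteration; the returned value ||A u|| with ||u|| = 1 is a lower
    bound for the spectral norm that converges to it for symmetric A."""
    n = len(a)
    u = [1.0 / math.sqrt(n)] * n
    norm = 0.0
    for _ in range(iterations):
        w = [sum(a_ij * u_j for a_ij, u_j in zip(row, u)) for row in a]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        u = [x / norm for x in w]
    return norm

if __name__ == "__main__":
    a = kernel_matrix(60, 0.2)
    print(spectral_norm_estimate(a), max_row_sum(a))
```

Since each row sum equals $S_{ni}/n$ in the notation of Lemma 5.1, the largest row sum is of order $n^{-1}$, which is the rate used for $D_n$.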



Another ingredient is an upper bound for the number of different possible orderings in the sample $\langle X_1, \gamma \rangle, \ldots, \langle X_n, \gamma \rangle$ when $\gamma$ belongs to the unit hypersphere in $\mathbb{R}^p$ (obviously the same number is obtained if $\gamma$ is allowed to belong to the whole space $\mathbb{R}^p$). Let $x_1, \ldots, x_n$ be a collection of $n$ points in $\mathbb{R}^p$ and let $\pi$ be a permutation of the set of integers $\{1, 2, \ldots, n\}$. Following Cover (1967), we shall say that $\gamma \in \mathcal{S}^p$ induces the ordering $\pi$ if

$$\langle x_{\pi(1)}, \gamma \rangle < \langle x_{\pi(2)}, \gamma \rangle < \cdots < \langle x_{\pi(n)}, \gamma \rangle.$$

Conversely, the ordering $\pi$ will be said to be linearly inducible if there exists such a vector $\gamma$. The following result is due to Cover (1967).

Lemma 5.4 There are precisely $q(n,p)$ linearly inducible orderings of $n$ points in general position in $\mathbb{R}^p$, where

$$q(n,p) = 2 \sum_{k=0}^{p-1} S_{n,k} = 2 \Big[ 1 + \sum_{2 \le i \le n-1} i + \sum_{2 \le i < j \le n-1} ij + \cdots \Big] \quad (p \text{ terms}),$$

where $S_{n,k}$ is the sum of the $(n-2)!/[(n-2-k)!\,k!]$ possible products of numbers taken $k$ at a time without repetition from the set $\{2, 3, \ldots, n-1\}$.

By Lemma 5.4 we obtain a simple upper bound for $q(n,p)$ when $n > 2p$, that is

$$q(n,p) \le 2[1 + n^2 + \cdots + n^p] \le n^{p+1}. \qquad (5.3)$$
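The count in Lemma 5.4 is easy to evaluate exactly, since $S_{n,k}$ is the elementary symmetric polynomial of degree $k$ in the numbers $2, \ldots, n-1$. The short sketch below (the function name `cover_q` is hypothetical) computes $q(n,p)$ with integer arithmetic. Three sanity checks follow directly from the formula: $q(n,1) = 2$, $q(n,2) = n(n-1)$, and $q(n,n-1) = n!$, the last one because $\sum_k S_{n,k} = \prod_{i=2}^{n-1}(1+i) = n!/2$.

```python
from math import factorial

def cover_q(n, p):
    """q(n, p) = 2 * sum_{k=0}^{p-1} S_{n,k}, where S_{n,k} is the sum of all
    products of k distinct numbers taken from {2, ..., n-1} (S_{n,0} = 1)."""
    if n < 1 or p < 1:
        raise ValueError("n and p must be positive integers")
    # e[k] accumulates the elementary symmetric polynomial of degree k.
    e = [1] + [0] * (p - 1)
    for value in range(2, n):
        # Update from high degree to low so each value is used at most once.
        for k in range(p - 1, 0, -1):
            e[k] += value * e[k - 1]
    return 2 * sum(e)

if __name__ == "__main__":
    for n, p in [(5, 1), (5, 2), (5, 4), (10, 3)]:
        print(n, p, cover_q(n, p), factorial(n))
```

The exact values make it possible to see how quickly the number of orderings that must be controlled in the union bound grows with $p$.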
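Proof of Lemma 3.1. Fix $M$ that depends on $n$ in a way that will be specified below. Let

$$Q_{M,n}(\gamma) = \frac{1}{n(n-1)h} \sum_{1 \le i \ne j \le n} \langle U_{M,i}, U_{M,j} \rangle\, K_h\big( F_{\gamma,n}(\langle X_i, \gamma \rangle) - F_{\gamma,n}(\langle X_j, \gamma \rangle) \big),$$

where $U_{M,i} = U_i 1_{\{\|U_i\| \le M\}} - \mathbb{E}[U_i 1_{\{\|U_i\| \le M\}} \mid X_i]$. We can write

$$\mathbb{P}\Big( \sup_{\gamma \in \mathcal{S}^p} |Q_{M,n}(\gamma)| > \frac{t\, p \ln n}{n h^{1/2}} \Big) = \mathbb{E}\Big[ \mathbb{P}\Big( \sup_{\gamma \in \mathcal{S}^p} |Q_{M,n}(\gamma)| > \frac{t\, p \ln n}{n h^{1/2}} \ \Big|\ X_1, \ldots, X_n \Big) \Big].$$

In view of Lemma 5.4, for any $n, p$, given $X_1, \ldots, X_n$ there exists a set $\mathcal{O}_{np} \subset \mathbb{R}^p$ with at most $n^{p+1}$ elements, that depend on $X_1, \ldots, X_n$, such that

$$\sup_{\gamma \in \mathcal{S}^p} |Q_{M,n}(\gamma)| = \sup_{\gamma \in \mathcal{O}_{np}} |Q_{M,n}(\gamma)|.$$

Let $b_n = M^{-2} n^{-1} h^{-1/2} p \ln n$. By Lemmas 5.2 and 5.4 deduce that there exists a universal constant $L$ such that for any $t > 0$,

$$\begin{aligned}
\mathbb{P}\Big( \sup_{\gamma \in \mathcal{S}^p} |Q_{M,n}(\gamma)| > \frac{t\, p \ln n}{n h^{1/2}} \ \Big|\ X_1, \ldots, X_n \Big) &\le \sum_{\gamma \in \mathcal{O}_{np}} \mathbb{P}\big( |M^{-2} Q_{M,n}(\gamma)| > t b_n \mid X_1, \ldots, X_n \big) \\
&\le \max\{L, 1\} \exp\left[ (p+1) \ln n - \frac{1}{L} \min\left( \frac{t^2 b_n^2}{C_n^2},\ \frac{t b_n}{D_n},\ \frac{(t b_n)^{2/3}}{B_n^{2/3}},\ \frac{(t b_n)^{1/2}}{A_n^{1/2}} \right) \right].
\end{aligned}$$

Now, take $M = n^{1/4 - a}$ for some (small) $a > 0$ and notice that the exponential bound in the last display is independent of $X_1, \ldots, X_n$ and tends to zero for any $t$. Deduce that

$$\sup_{\gamma \in \mathcal{S}^p} |Q_{M,n}(\gamma)| = o_{\mathbb{P}}(n^{-1} h^{-1/2} p \ln n).$$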

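Next we show that $\sup_{\gamma \in \mathcal{S}^p} |Q_n(\gamma) - Q_{M,n}(\gamma)| = o_{\mathbb{P}}(n^{-1} h^{-1/2} p \ln n)$. Let

$$R_{1n}(\gamma) = \frac{1}{n(n-1)h} \sum_{1 \le i \ne j \le n} \langle U_{M,i}, U_j - U_{M,j} \rangle\, K_h\big( F_{\gamma,n}(\langle X_i, \gamma \rangle) - F_{\gamma,n}(\langle X_j, \gamma \rangle) \big), \qquad \gamma \in \mathcal{S}^p,$$

and $R_{2n}(\gamma) = Q_n(\gamma) - Q_{M,n}(\gamma) - 2R_{1n}(\gamma)$. We have

$$\mathbb{E} \sup_{\gamma} |R_{1n}(\gamma)| \le C h^{-1}\, \mathbb{E}(\|U_{M,i}\| \|U_j - U_{M,j}\|) \le 2C h^{-1}\, \mathbb{E}(\|U_i\|)\, \mathbb{E}(\|U_j - U_{M,j}\|).$$

By Hölder inequality and Chebyshev inequality

$$\mathbb{E}(\|U_j - U_{M,j}\|) \le 2\, \mathbb{E}^{1/m}[\|U_j\|^m]\, \mathbb{P}^{(m-1)/m}[\|U_j\| > M] \le 2\, \mathbb{E}[\|U_j\|^m]\, M^{1-m}.$$

Now, to deduce that $R_{1n}(\gamma)$ is uniformly negligible, it suffices to note that under Assumption K-(b), for $m > 7$ and $a$ sufficiently small,

$$M^{1-m} = n^{(1-m)(1/4 - a)} = o(n^{-1} h^{1/2} \ln n).$$

Clearly, $\sup_\gamma |R_{2n}(\gamma)|$ is of smaller order than $\sup_\gamma |R_{1n}(\gamma)|$.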

For the inverse of the variance estimator, for any $\gamma \in \mathcal{S}^p$ let us define

$$v_{N,n}^2(\gamma) = \frac{2}{n(n-1)h} \sum_{1 \le i \ne j \le n} \langle U_i, U_j \rangle^2\, 1_{\{\langle U_i, U_j \rangle^2 \le N\}}\, K_h^2\big( F_{\gamma,n}(\langle X_i, \gamma \rangle) - F_{\gamma,n}(\langle X_j, \gamma \rangle) \big).$$

Using Hölder inequality, Chebyshev inequality and Cauchy-Schwarz inequality,

$$\begin{aligned}
\mathbb{E} \sup_{\gamma} |v_n^2(\gamma) - v_{N,n}^2(\gamma)| &\le C h^{-1}\, \mathbb{E}\big( \langle U_i, U_j \rangle^2\, 1_{\{\langle U_i, U_j \rangle^2 > N\}} \big) \\
&\le C h^{-1}\, \mathbb{E}^{1/s}[\langle U_i, U_j \rangle^{2s}]\, \mathbb{P}^{(s-1)/s}[\langle U_i, U_j \rangle^{2s} > N^s] \\
&\le C h^{-1}\, \mathbb{E}^2[\|U_i\|^{2s}]\, N^{1-s}.
\end{aligned}$$

Take $s = 4$, $N = n^{1/4}$ and deduce that the right bound in the last display tends to zero. On the other hand, we apply Hoeffding (1963) inequality for U-statistics to control the deviations of $v_{N,n}^2(\gamma) - \mathbb{E}[v_{N,n}^2(\gamma) \mid X_1, \ldots, X_n]$ conditionally on $X_1, \ldots, X_n$. For any fixed $\gamma$ we have

$$\mathbb{P}\Big( n^{1/2} h\, \big| v_{N,n}^2(\gamma) - \mathbb{E}[v_{N,n}^2(\gamma) \mid X_1, \ldots, X_n] \big| > t \ \Big|\ X_1, \ldots, X_n \Big) \le 2 \exp\left( -\frac{[n/2]\, n^{-1} t^2}{2[\tau^2 + K^2(0) N n^{-1/2} t / 3]} \right),$$

where $\tau^2$ is the variance of a term in the sum defining $h v_{N,n}^2(\gamma) - \mathbb{E}[h v_{N,n}^2(\gamma) \mid X_1, \ldots, X_n]$. Take $t = n^{1/2 - c}$ for some small $c > 0$ and note that $\tau^2 \le C$ for some constant independent of $\gamma$ and $h$. In the same way as for $Q_{M,n}(\gamma)$, applying Lemma 5.4 we obtain an exponential bound for the tail of $v_{N,n}^2(\gamma) - \mathbb{E}[v_{N,n}^2(\gamma) \mid X_1, \ldots, X_n]$ given $X_1, \ldots, X_n$, uniformly with respect to $\gamma$. This bound is independent of $X_1, \ldots, X_n$. Finally integrate out $X_1, \ldots, X_n$ and deduce that

$$\sup_{\gamma} \big| v_{N,n}^2(\gamma) - \mathbb{E}[v_{N,n}^2(\gamma) \mid X_1, \ldots, X_n] \big| = o_{\mathbb{P}}(1).$$

It remains to note that Assumption D-(c) and the first part of Lemma 5.1 guarantee that $\mathbb{E}[v_{N,n}^2(\gamma) \mid X_1, \ldots, X_n]$ stays away from zero. Gathering the results we conclude that $1/v_n(\gamma)$ is uniformly bounded in probability. Now the proof is complete. ■

Proof of Theorem 3.3. By Lemma 3.2 it suffices to prove the asymptotic normality of the test statistic $T_n$ defined with $\gamma = \gamma_0^{(p)}$. The proof of this asymptotic normality is based on the Central Limit Theorem 5.1 of de Jong (1987). We will apply the result of de Jong conditionally given the values of the covariate sample. Let $x_1, \ldots, x_n$ be an arbitrary collection of non-random points in $L^2[0,1]$. Consider $Z_1, \ldots, Z_n$ independent random variables with values in $L^2[0,1]$ such that for each $i$ the law of $Z_i$ is the conditional law of $U_i$ given $X_i = x_i$. Let $F_{\gamma_0^{(p)},n}(\cdot)$ be the empirical d.f. of the sample $\langle x_1, \gamma_0^{(p)} \rangle, \ldots, \langle x_n, \gamma_0^{(p)} \rangle$, and

$$W_{ij} = \frac{1}{n(n-1)h}\, \langle Z_i, Z_j \rangle\, K_{h,ij}(\gamma_0^{(p)}), \quad 1 \le i \ne j \le n, \qquad W_{ii} = 0, \quad 1 \le i \le n.$$

Hence $Q_n(\gamma_0^{(p)}) = \sum_{i,j} W_{ij}$ and $v_n^2(\gamma_0^{(p)}) = 2n(n-1)h \sum_{i,j} W_{ij}^2$. A crucial remark that is used several times in the following is that the elements of the matrix $(K_{h,ij}(\gamma_0^{(p)}))$ are the same as those of the matrix $(K((i-j)/(nh)))$ up to permutations of lines and columns.
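The permutation remark can be illustrated in a few lines. When there are no ties, the empirical d.f. evaluated at the sample points takes the values $1/n, 2/n, \ldots, 1$, so that sorting the observations turns the kernel matrix into the fixed matrix with entries $K((a-b)/(nh))$. The sketch below is only an illustration: the kernel and the function names are assumptions.

```python
import math

def epanechnikov(z):
    """Illustrative symmetric kernel (an assumption)."""
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def empirical_cdf_values(values):
    """Return (F_n(v_1), ..., F_n(v_n)) and the sorting order, assuming no ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    f = [0.0] * n
    for rank, i in enumerate(order, start=1):
        f[i] = rank / n
    return f, order

def nn_kernel_matrix(values, h, kernel=epanechnikov):
    """Matrix with entries K((F_n(v_i) - F_n(v_j)) / h)."""
    f, order = empirical_cdf_values(values)
    n = len(values)
    mat = [[kernel((f[i] - f[j]) / h) for j in range(n)] for i in range(n)]
    return mat, order

if __name__ == "__main__":
    projections = [0.3, -1.2, 2.5, 0.9, -0.4]
    mat, order = nn_kernel_matrix(projections, h=0.5)
    n = len(projections)
    for a in range(n):
        for b in range(n):
            target = epanechnikov((a - b) / (n * 0.5))
            assert math.isclose(mat[order[a]][order[b]], target, abs_tol=1e-12)
    print("kernel matrix is a row/column permutation of K((a - b) / (n h))")
```

This is why the bounds derived from Lemma 5.1 do not depend on the covariate values or on the direction $\gamma$.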
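Following the notation of de Jong (1987), let

$$\sigma_{ij}^2 = \mathbb{E}(W_{ij}^2) = \frac{K_{h,ij}^2(\gamma_0^{(p)})}{n^2(n-1)^2 h^2}\, \mathbb{E}[\langle U_i, U_j \rangle^2 \mid X_i = x_i, X_j = x_j]$$

and $\sigma^2(n) = 2 \sum_{i \ne j} \sigma_{ij}^2$. Since

$$\mathbb{E}[\langle U_i, U_j \rangle^2 \mid X_1, \ldots, X_n] = \mathbb{E}[\langle U_i, U_j \rangle^2 \mid X_i, X_j] \le \mathbb{E}[\|U_i\|^2 \mid X_i]\, \mathbb{E}[\|U_j\|^2 \mid X_j],$$

and $\mathbb{E}[\langle U_i, U_j \rangle^2 \mid X_i, X_j]$ is bounded away from zero by Assumption D-(c), deduce that there exist positive constants $\underline{c}$ and $\overline{c}$ such that

$$\frac{\underline{c}}{n^4 h^2}\, K_{h,ij}^2(\gamma_0^{(p)}) \le \sigma_{ij}^2 \le \frac{\overline{c}}{n^4 h^2}\, K_{h,ij}^2(\gamma_0^{(p)}). \qquad (5.4)$$

Apply Lemma 5.1 with $K$ replaced by $K^2$ and deduce that

$$\frac{c_1}{n^3 h} \le \max_{1 \le i \le n} \sum_{j} \sigma_{ij}^2 \le \frac{\overline{c}}{n^4 h^2} \max_{1 \le i \le n} \sum_{j \ne i} K^2\left( \frac{i-j}{nh} \right) \le \frac{c_2}{n^3 h},$$

for some constants $c_1$ and $c_2$. Moreover, there exist constants $c$ and $c'$ such that

$$c\, n^{-2} h^{-1} \le \sigma(n)^2 \le c'\, n^{-2} h^{-1}.$$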

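It follows that

$$\sigma(n)^{-2} \max_{1 \le i \le n} \sum_{j=1}^{n} \sigma_{ij}^2 = O(n^{-1}),$$

and thus Condition 1 in Theorem 5.1 of de Jong (1987) holds true as soon as $\kappa(n) = o(n^{1/2})$. For checking Condition 2 in Theorem 5.1 of de Jong (1987), let us use Hölder inequality with $p = \nu/2$ and $q = \nu/(\nu - 2)$, with $\nu$ given by Assumption D-(c)-(ii), and Markov inequality to get, for some constant $C$,

$$\mathbb{E}\big[ \sigma_{ij}^{-2} W_{ij}^2\, 1_{\{\sigma_{ij}^{-1} |W_{ij}| > \kappa(n)\}} \big] \le \mathbb{E}^{2/\nu}\big[ \sigma_{ij}^{-\nu} |W_{ij}|^{\nu} \big]\, \mathbb{P}^{(\nu-2)/\nu}\big[ \sigma_{ij}^{-1} |W_{ij}| > \kappa(n) \big] \le C \kappa(n)^{-(\nu-2)/\nu}.$$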

That shows that Condition 2 of Theorem 5.1 of de Jong holds true with any $\kappa(n)$ tending to infinity. Finally, let $\mu_1, \ldots, \mu_n$ denote the eigenvalues of the matrix $(\sigma_{ij})$. To check Condition 3 of de Jong, use the upper bound of $\sigma_{ij}^2$ in (5.4) to deduce that there exists a constant $C$ (independent of $n$ and $i$) such that

$$\sigma_{ij} \le \frac{C}{n^2 h}\, K_{h,ij}(\gamma_0^{(p)}).$$

Next, note that if $\Sigma$ denotes the $n \times n$ matrix with generic element $\sigma_{ij}$, following the lines of equation (5.2) and using equation (5.4), for any $u \in \mathbb{R}^n$,

$$\|\Sigma u\|^2 \le \|u\|^2 \Big[ \max_{1 \le i \le n} \sum_{j} \sigma_{ij} \Big]^2 \le c_1 \|u\|^2 \Big[ \max_{1 \le i \le n} \sum_{j \ne i} \frac{K((i-j)/(nh))}{hn(n-1)} \Big]^2 \le c_2\, n^{-2} \|u\|^2, \qquad (5.5)$$

for some constants $c_1, c_2 > 0$. Deduce that

$$\sigma(n)^{-2} \max_{1 \le i \le n} \mu_i^2 \le C h \to 0,$$

and thus Condition 3 of de Jong (1987) holds true. To complete the proof of the asymptotic normality of the statistic $T_n = n h^{1/2} Q_n(\gamma_0^{(p)}) / v_n(\gamma_0^{(p)})$ given the covariate values, note that

$$\sigma^2(n) = \mathbb{E}[Q_n^2(\gamma_0^{(p)}) \mid X_1 = x_1, \ldots, X_n = x_n] = \frac{\mathbb{E}[v_n^2(\gamma_0^{(p)}) \mid X_1 = x_1, \ldots, X_n = x_n]}{n(n-1)h}.$$

Moreover, by direct standard calculations it can be shown that the variance of

$$\frac{1}{n(n-1)h} \sum_{i \ne j} \langle Z_i, Z_j \rangle^2\, K_{h,ij}^2(\gamma_0^{(p)})$$

is $o(1)$. Deduce that

$$\frac{v_n^2(\gamma_0^{(p)}) / [n(n-1)h]}{\sigma^2(n)} = 1 + o_{\mathbb{P}}(1), \qquad (5.6)$$

given $X_1 = x_1, \ldots, X_n = x_n$. The asymptotic normality of $T_n$ given $X_1 = x_1, \ldots, X_n = x_n$ is a consequence of Theorem 5.1 of de Jong and equation (5.6). The proof is complete. ■
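The quantities manipulated in this proof can be computed directly for a fixed direction. The sketch below evaluates $Q_n = \sum_{i,j} W_{ij}$, $v_n^2 = 2n(n-1)h \sum_{i,j} W_{ij}^2$ and $T_n = n h^{1/2} Q_n / v_n$ from discretized response curves and from the scalar projections $\langle X_i, \gamma \rangle$. It is a minimal illustration under stated assumptions: the curves are sampled on a common equispaced grid, the inner product is approximated by an average, the kernel is Epanechnikov, there are no ties among the projections, and the function names are hypothetical. The search for the least favorable direction is not included.

```python
import math

def epanechnikov(z):
    """Illustrative symmetric kernel (an assumption)."""
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def inner(u, v):
    """L2[0,1] inner product approximated by the mean of pointwise products."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

def nn_test_statistic(responses, projections, h, kernel=epanechnikov):
    """Return (Q_n, v_n^2, T_n) for a fixed direction.

    responses   : list of n discretized curves U_i (lists of equal length)
    projections : list of n scalars <X_i, gamma> (assumed without ties)
    """
    n = len(responses)
    order = sorted(range(n), key=lambda i: projections[i])
    f = [0.0] * n
    for rank, i in enumerate(order, start=1):
        f[i] = rank / n          # empirical d.f. at the i-th projection
    q_sum = 0.0
    v_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            k = kernel((f[i] - f[j]) / h)
            g = inner(responses[i], responses[j])
            q_sum += g * k
            v_sum += g * g * k * k
    q_n = q_sum / (n * (n - 1) * h)
    v2_n = 2.0 * v_sum / (n * (n - 1) * h)
    # v2_n must be positive for the ratio to be defined.
    t_n = n * math.sqrt(h) * q_n / math.sqrt(v2_n)
    return q_n, v2_n, t_n

if __name__ == "__main__":
    proj = [i - 9.5 for i in range(20)]
    resp = [[x] * 5 for x in proj]   # responses that depend strongly on the projection
    print(nn_test_statistic(resp, proj, h=0.2))
```

Rescaling all responses by a constant multiplies $Q_n$ and $v_n$ by the same factor, so $T_n$ is unchanged; large positive values of $T_n$ are then compared with standard normal critical values, as stated in the abstract.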

Proof of Theorem 3.4. The proof is based on inequality (3.4). Since $\mathbb{E}(\langle U_1, U_2 \rangle^2 \mid X_1, X_2) \ge \sigma^2 + r_n^4 \langle \delta(X_1), \delta(X_2) \rangle^2$, clearly the variance estimate $v_n^2(\gamma)$ stays away from zero for all $\gamma$. On the other hand, by Cauchy-Schwarz and the property of the spectral norm for matrices,

$$v_n^2(\gamma) \le C \sum_{1 \le i \ne j \le n} \|U_i\|^2 \|U_j\|^2\, [\mathcal{K}_2]_{ij} \le C\, |||\mathcal{K}_2|||_2 \sum_{1 \le i \le n} \|U_i\|^4,$$

where $\mathcal{K}_2$ is the matrix with entries $n^{-2} h^{-1} K_h^2\big( F_{\gamma,n}(\langle X_i, \gamma \rangle) - F_{\gamma,n}(\langle X_j, \gamma \rangle) \big)$. By the arguments used in equation (5.5), $|||\mathcal{K}_2|||_2 = O_{\mathbb{P}}(n^{-1})$. This together with the finite fourth order moment condition for $\delta(\cdot)$ implies that $v_n^2(\gamma)$ is bounded in probability. Hence it suffices to look at the behavior of $Q_n(\gamma)$. By Lemma 2.1-(B) there exist $p_0$ and $\gamma \in B_{p_0} \subset \mathcal{S}^{p_0}$ ($p_0$ and $\gamma$ independent of $n$) such that $\mathbb{E}[\delta(X) \mid \langle X, \gamma \rangle] \ne 0$. Hereafter, $\gamma$ is supposed to have this property. Let $V_i = F_\gamma(\langle X_i, \gamma \rangle)$, so that $V_1, \ldots, V_n$ are independent uniform variables on $[0,1]$. Next introduce $V_{ni} = F_{\gamma,n}(\langle X_i, \gamma \rangle)$ and

$$\Delta_n = \sup_{1 \le i \le n} |V_{ni} - V_i| \le \sup_{t \in \mathbb{R}} |F_{\gamma,n}(t) - F_\gamma(t)|.$$

Note that for any $s \in \mathbb{R}$,

$$\big| e^{isV_{ni}} - e^{isV_i} \big| \le |s|\, \Delta_n,$$

and $\Delta_n = O_{\mathbb{P}}(n^{-1/2})$.
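The rate $\Delta_n = O_{\mathbb{P}}(n^{-1/2})$ can be observed in a small simulation. Since the $V_i$ are uniform on $[0,1]$ and, without ties, $V_{ni}$ is the rank of $V_i$ divided by $n$, the quantity $\Delta_n$ is the largest gap between the sorted $V_i$ and the grid $1/n, \ldots, 1$. The sketch below is an illustration only; the function name, seed and sample sizes are hypothetical.

```python
import math
import random

def delta_n(n, rng):
    """sup_i |V_ni - V_i| for n i.i.d. uniform variables, with V_ni = rank / n."""
    v = sorted(rng.random() for _ in range(n))
    return max(abs((i + 1) / n - v[i]) for i in range(n))

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 10000):
        d = delta_n(n, rng)
        # sqrt(n) * Delta_n should remain of order one as n grows.
        print(n, d, math.sqrt(n) * d)
```

The normalized values $\sqrt{n}\,\Delta_n$ stay of order one, in line with the Dvoretzky-Kiefer-Wolfowitz inequality $\mathbb{P}(\sup_t |F_n(t) - F(t)| > \varepsilon) \le 2\exp(-2n\varepsilon^2)$.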
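We can write

$$\begin{aligned}
Q_n(\gamma) &= \frac{1}{n(n-1)h} \sum_{i \ne j} \langle U_i^0, U_j^0 \rangle K_h(V_{ni} - V_{nj}) + \frac{2 r_n}{n(n-1)h} \sum_{i \ne j} \langle U_i^0, \delta(X_j) \rangle K_h(V_{ni} - V_{nj}) \\
&\quad + \frac{r_n^2}{n(n-1)h} \sum_{i \ne j} \langle \delta(X_i), \delta(X_j) \rangle K_h(V_{ni} - V_{nj}) \\
&=: Q_{0n}(\gamma) + 2 r_n Q_{1n}(\gamma) + r_n^2 Q_{2n}(\gamma).
\end{aligned}$$

Since $\gamma$ is fixed, $Q_{0n}(\gamma) = O_{\mathbb{P}}(n^{-1} h^{-1/2})$ (cf. proof of Theorem 3.3). Let

$$Q_{1n}^*(\gamma) = \frac{1}{n(n-1)h} \sum_{i \ne j} \langle U_i^0, \delta(X_j) \rangle K_h(V_i - V_j).$$

First we show that $Q_{1n}(\gamma) - Q_{1n}^*(\gamma) = o_{\mathbb{P}}(1)$. If $K$ satisfies a Lipschitz condition,

$$|Q_{1n}(\gamma) - Q_{1n}^*(\gamma)| \le \frac{C \Delta_n}{n^2 h^2} \sum_{1 \le i \ne j \le n} \|U_i^0\| \|\delta(X_j)\| = \frac{C \Delta_n}{h^2}\, O_{\mathbb{P}}(1) = o_{\mathbb{P}}(1).$$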
l<i^j'<n 
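Next, the U-statistic $Q_{1n}^*(\gamma)$ can be decomposed into a degenerate U-statistic of order 2 with the rate $O_{\mathbb{P}}(h^{-1} n^{-1}) = o_{\mathbb{P}}(n^{-1/2})$ and the average of centered variables

$$\frac{1}{n} \sum_{1 \le i \le n} \big\langle U_i^0,\ \mathbb{E}[\delta(X_j) h^{-1} K_h(V_i - V_j) \mid V_i] \big\rangle.$$

Hence it suffices to bound the variance of the terms in the sum. We can write

$$\begin{aligned}
\nu^2 &= \mathbb{E}\big\{ \big\langle U_i^0,\ \mathbb{E}[\delta(X_j) h^{-1} K_h(V_i - V_j) \mid V_i] \big\rangle^2 \big\} \\
&\le \mathbb{E}\big\{ \|U_i^0\|^2\, \big\| \mathbb{E}[\delta(X_j) h^{-1} K_h(V_i - V_j) \mid V_i] \big\|^2 \big\} \\
&\le c\, \mathbb{E}\big\{ \big\| \mathbb{E}[\delta(X_j) h^{-1} K_h(V_i - V_j) \mid V_i] \big\|^2 \big\},
\end{aligned}$$

for some constant $c > 0$ larger than $\mathbb{E}\{\|U_i^0\|^2 \mid X_i\}$. Next we show that the map $v \mapsto \mathbb{E}[\delta(X_j) h^{-1} K_h(v - V_j)]$ is square integrable. Let $\delta(t, v) = \mathbb{E}[\delta(X_j)(t) \mid V_j = v]$ and note that $0 \le \int\!\!\int_{[0,1] \times [0,1]} |\delta(t,v)|^2\, dv\, dt < \infty$. Then using the Inverse Fourier Transform for $K$ we have for any $t$

$$\begin{aligned}
\mathbb{E}[\delta(X_j)(t) h^{-1} K_h(v - V_j)] &= \mathbb{E}\Big[ \delta(t, V_j) \int_{\mathbb{R}} \exp\{is(v - V_j)\}\, \mathcal{F}[K](hs)\, ds \Big] \\
&= \int_{\mathbb{R}} \exp\{isv\}\, \mathcal{F}[\delta(t, \cdot)](-s)\, \mathcal{F}[K](hs)\, ds. \qquad (5.8)
\end{aligned}$$

Take absolute value and deduce that

$$\begin{aligned}
\int\!\!\int_{[0,1] \times [0,1]} \mathbb{E}^2[\delta(X_j)(t) h^{-1} K_h(v - V_j)]\, dv\, dt &\le \int\!\!\int_{[0,1] \times [0,1]} \Big( \int_{\mathbb{R}} |\mathcal{F}[\delta(t, \cdot)](s)|\, ds \Big)^2 dv\, dt \\
&\le \int\!\!\int_{[0,1] \times [0,1]} \int_{\mathbb{R}} |\mathcal{F}[\delta(t, \cdot)](s)|^2\, ds\, dv\, dt \\
&= \int\!\!\int_{[0,1] \times [0,1]} \int_{\mathbb{R}} |\delta(t, u)|^2\, du\, dv\, dt.
\end{aligned}$$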

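Since the $V_i$'s are uniformly distributed, we can deduce that $\nu^2$ is bounded and thus $Q_{1n}^*(\gamma) = O_{\mathbb{P}}(n^{-1/2})$. Now, let

$$Q'_{2n}(\gamma) = \frac{1}{n^2 h} \sum_{1 \le i, j \le n} \langle \delta(X_i), \delta(X_j) \rangle K_h(V_{ni} - V_{nj}), \qquad Q''_{2n}(\gamma) = \frac{1}{n^2 h} \sum_{1 \le i, j \le n} \langle \delta(X_i), \delta(X_j) \rangle K_h(V_i - V_j).$$

It is easy to check that

$$\frac{n-1}{n}\, Q_{2n}(\gamma) - Q'_{2n}(\gamma) = -\frac{K(0)}{n^2 h} \sum_{i=1}^{n} \|\delta(X_i)\|^2 = O_{\mathbb{P}}(n^{-1} h^{-1}) = o_{\mathbb{P}}(1).$$

Next we have to show that $Q'_{2n}(\gamma) - Q''_{2n}(\gamma) = o_{\mathbb{P}}(1)$. If $K$ satisfies a Lipschitz condition and $nh^4 \to \infty$, by Cauchy-Schwarz inequality, for some constant $C > 0$,

$$|Q'_{2n}(\gamma) - Q''_{2n}(\gamma)| \le \frac{C \Delta_n}{h^2}\, \frac{1}{n} \sum_{1 \le i \le n} \|\delta(X_i)\|^2.$$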

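Conclude that $Q_{2n}(\gamma) - Q''_{2n}(\gamma) = o_{\mathbb{P}}(1)$, so that it suffices to investigate $Q''_{2n}(\gamma)$. It is easy to show that the variance of $Q''_{2n}(\gamma)$ tends to zero, so that it remains to show that the expectation of $Q''_{2n}(\gamma)$ stays away from zero. From the representation (5.8) and repeated applications of Fubini's theorem we get

$$\begin{aligned}
\mathbb{E}[\langle \delta(X_i), \delta(X_j) \rangle h^{-1} K_h(V_i - V_j)] &= \mathbb{E}\big( \big\langle \delta(X_i),\ \mathbb{E}[\delta(X_j) h^{-1} K_h(V_i - V_j) \mid X_i] \big\rangle \big) \\
&= \int_{[0,1]} \mathbb{E}\Big[ \delta(X_i)(t) \int_{\mathbb{R}} \exp\{isV_i\}\, \mathcal{F}[\delta(t, \cdot)](-s)\, \mathcal{F}[K](hs)\, ds \Big] dt \\
&= \int_{[0,1]} \int_{\mathbb{R}} |\mathcal{F}[\delta(t, \cdot)](s)|^2\, \mathcal{F}[K](hs)\, ds\, dt.
\end{aligned}$$

By Lebesgue dominated convergence theorem and Plancherel theorem,

$$\mathbb{E}[Q''_{2n}(\gamma)] \to \int_{[0,1]} \int_{[0,1]} |\delta(t, v)|^2\, dv\, dt.$$

Deduce that $\mathbb{P}[c^{-1} \le Q_{2n}(\gamma) \le c] \to 1$ for some constant $c > 0$. Gathering the results conclude that for any $C > 0$, $\mathbb{P}[T_n > C] \to 1$. ■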